Show/Hide Code
using Pumas
using PumasUtilities
using DataFramesMeta
using CSV
using SummaryTables
using CairoMakie
using AlgebraOfGraphics
using RandomA Hands-On Journey Through Pharmacometric Analysis with Pumas
using Pumas
using PumasUtilities
using DataFramesMeta
using CSV
using SummaryTables
using CairoMakie
using AlgebraOfGraphics
using Randomrng = Random.seed!(5438)Welcome to this comprehensive, hands-on tutorial that demonstrates the complete Pumas workflow for continuous data modeling. This appendix accompanies the main article by providing details needed to reproduce the analysis from scratch—every line of code, every modeling decision, and every diagnostic check.
This tutorial walks through the PKPD workflow, from raw clinical trial data through to simulation-based dose recommendations. This document demonstrates how Pumas integrates each stage of the workflow:
Data Preparation → Exploratory Analysis → Non-Compartmental Analysis → Population PK Modeling → PKPD Integration → Simulation & Decision Support
Along the way, you’ll discover not just what to do, but why each step matters for robust drug development decisions.
This document is designed to be:
First-time learners: Read sequentially, executing each code block to build intuition
Experienced pharmacometricians: Jump to specific sections using the table of contents—each analysis stage is self-contained
Reference seekers: Use section cross-references to find detailed implementations
Our analysis follows the pharmacometric workflow presented in the main article (Figure 1). The appendix sections map to these stages:
| Stage | Description | Appendix Section |
|---|---|---|
| 1 | Data Preparation & Quality Checks | A.2 (#sec-data-prep) |
| 2 | Exploratory Data Analysis | A.2 (#sec-data-prep) |
| 3 | Non-Compartmental Analysis | A.3 (#sec-nca) |
| 4-8 | Population PK Model Development | A.4 (#sec-pop-pk) |
| 9-10 | Parameter Inference & Uncertainty | A.5 (#sec-uncertainty) |
| 11 | Pharmacodynamic Modeling | A.6 (#sec-pkpd) |
| 12 | Simulation for Decision Support | A.7 (#sec-simulation) |
Let’s begin our journey with the most fundamental step: preparing our data for analysis.
In any pharmacometric analysis, data preparation is where rigor begins. Before we fit a single model, we must ensure our data is clean, well-structured, and ready for analysis. This section demonstrates how Pumas transforms raw clinical trial data into analysis-ready population objects while maintaining data integrity at every step.
Our journey begins with a CSV file exported from the clinical database—the standard format for pharmacometric analyses. Let’s load it and take our first look:
path = joinpath(@__DIR__, "data", "intro_paper_data.csv")
wide_data = CSV.read(path, DataFrame)
first(wide_data, 5)| Row | ID | NOMTIME | PROFTIME | PROFDAY | AMT | EVID | CMT | ROUTE | CONC | PD_CONC | SEX | WEIGHTB | DOSE | TRTACT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| String7 | Float64 | Float64 | Float64 | Float64 | Int64 | String7? | String3 | Float64? | Float64? | String7 | Float64 | Int64 | String7 | |
| 1 | 0_1 | -0.1 | 0.0 | 0.0 | 0.0 | 0 | missing | ev | 0.0 | 49.19 | Male | 83.31 | 0 | Placebo |
| 2 | 0_1 | 0.0 | 0.0 | 1.0 | 0.0 | 1 | Depot | ev | missing | missing | Male | 83.31 | 0 | Placebo |
| 3 | 0_1 | 0.1 | 0.1 | 1.0 | 0.0 | 0 | missing | ev | 0.0 | missing | Male | 83.31 | 0 | Placebo |
| 4 | 0_1 | 0.5 | 0.5 | 1.0 | 0.0 | 0 | missing | ev | 0.0 | missing | Male | 83.31 | 0 | Placebo |
| 5 | 0_1 | 1.0 | 1.0 | 1.0 | 0.0 | 0 | missing | ev | 0.0 | missing | Male | 83.31 | 0 | Placebo |
This dataset follows the “wide format” convention where each observation type has its own column (CONC for PK, PD_CONC for PD). This differs from “long format” where all observations share a single column with a type identifier.
Why wide format matters: Pumas requires wide format. To read more on the data format requirements, see the Data Representation Tutorial.
At this stage, we’re looking for red flags:
The first() function gives us a quick preview. Everything looks reasonable—let’s proceed to create our analysis populations.
Here’s a key Pumas philosophy: One dataset, multiple populations. Different analyses require different data subsets and structures. We’ll create three populations, each optimized for its intended use:
Non-compartmental analysis requires complete concentration-time profiles without gaps. We filter strategically:
df_nca = deepcopy(wide_data)
# Filter to relevant rows: non-negative nominal time, active doses only, PK data only
@rsubset! df_nca begin
:NOMTIME >= 0
:DOSE > 0
:PROFDAY ∈ [1, 6] # First dose and steady state
end
# Transform nominal time from hours to days and add route information
@rtransform! df_nca begin
:NOMTIME = :NOMTIME / 24 # NCA functions expect days
:ROUTE = "ev" # extravascular (oral) administration
end
# Define units for NCA analysis
timeu = u"d" # days
concu = u"ng/mL" # nanograms per milliliter
amtu = u"mg" # milligramsNow we create the NCA population object:
population_nca = read_nca(
df_nca;
id=:ID,
time=:PROFTIME, # Time since last dose
observations=:CONC,
amt=:AMT,
route=:ROUTE,
group=[:PROFDAY, :DOSE] # Stratify by day and dose for comparisons
)NCAPopulation (50 subjects):
Group: [["PROFDAY" => 1.0, "DOSE" => 100], ["PROFDAY" => 1.0, "DOSE" => 200], ["PROFDAY" => 1.0, "DOSE" => 400], ["PROFDAY" => 1.0, "DOSE" => 800], ["PROFDAY" => 1.0, "DOSE" => 1600], ["PROFDAY" => 6.0, "DOSE" => 100], ["PROFDAY" => 6.0, "DOSE" => 200], ["PROFDAY" => 6.0, "DOSE" => 400], ["PROFDAY" => 6.0, "DOSE" => 800], ["PROFDAY" => 6.0, "DOSE" => 1600]]
Number of missing observations: 100
Number of blq observations: 0
The group argument tells Pumas we want to compare NCA metrics across dose levels and between Day 1 vs. Day 6.
For nonlinear mixed-effects modeling, we need the full time course including dosing events:
# population with placebo data (for covariate balance checks)
population_placebo = read_pumas(
wide_data;
id=:ID,
time=:NOMTIME,
observations=[:CONC, :PD_CONC], # Multi-endpoint: both PK and PD
amt=:AMT,
cmt=:CMT, # Compartment for dosing
evid=:EVID, # Event ID: 1=dose, 0=observation
covariates=[:WEIGHTB, :SEX, :TRTACT, :DOSE]
)
# population without placebo data (for PKPD model fitting)
population = read_pumas(
filter(r -> r.DOSE != 0, wide_data); # Exclude placebo
id=:ID,
time=:NOMTIME,
observations=[:CONC, :PD_CONC],
amt=:AMT,
cmt=:CMT,
evid=:EVID,
covariates=[:WEIGHTB, :SEX, :TRTACT, :DOSE]
)┌ Warning: Your dataset has bolus doses with zero dose amount. └ @ Pumas none:2327
Population
Subjects: 50
Covariates: WEIGHTB, SEX, TRTACT, DOSE
Observations: CONC, PD_CONC
With placebo (population_placebo): Use this to verify demographic balance across treatment groups. Imbalances in sex or weight distributions could confound covariate analyses.
Without placebo (population): Use this for PKPD model fitting. Placebo subjects have zero drug concentration—they don’t inform PK parameters and can cause numerical issues during estimation.
When you call read_pumas(), Pumas validates your data immediately:
A comprehensive list is provided in the Data Representation Tutorial.
If anything is wrong, you get a clear error message pointing to the specific problem—before you waste hours on model fitting.
This is input validation at work: catch errors early when they’re easy to fix, not after overnight estimation runs.
Before moving to exploratory analysis, verify:
Next step: Now that our data is properly structured, let’s explore it visually to understand what we’re modeling.
Before fitting models, it’s essential to understand the structure and characteristics of the data. This corresponds to Stage 2 in the workflow map (Figure 1). Exploratory data analysis (EDA) helps identify potential issues (missing data, outliers, data entry errors), understand covariate distributions, and visualize concentration-time profiles.
Pumas provides flexible tools for EDA that work with both raw DataFrames and structured Population objects:
Summary Tables
SummaryTables.jl functions like overview_table(), table_one(), listingtable() and summarytable()Visualizations
Can be created from either DataFrames or Population objects:
AlgebraOfGraphics.jl + CairoMakie.jl for full customizationPumasUtilities.jl plotting functions for pharmacometric-specific visualizationsPumasUtilities Plotting Functions
Population (NLME) and NCAPopulation (NCA) objectsobservations_vs_time()), while others are analysis-specific (e.g., summary_observations_vs_time() for the case of NCA)AlgebraOfGraphics.jl, so plots can be easily extended and customized using the same syntaxWorkflow Strategy
In this section, we demonstrate both approaches to explore PK and PD endpoints and understand covariate relationships. We’ll clearly indicate whether we’re using [DataFrame] or [Population] for each example, allowing you to choose the approach that best fits your workflow. Generally:
One useful function to ensure the data was read correctly and to understand its structure is overview_table() from the SummaryTables.jl package:
overview_table(wide_data)| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
| 1 | ID [String7] |
1. 0_6 2. 1600_8 3. 100_5 4. 1600_7 5. 800_9 6. 800_10 7. 0_3 8. 0_1 9. 400_1 10. 0_7 [50 others] |
33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 33 (1.7%) 1650 (83.3%) |
1980 (100%) |
0 (0%) |
|
| 2 | NOMTIME [Float64] |
Mean (sd): 80.2 (64) min ≤ med ≤ max: -0.1 ≤ 95.9 ≤ 216 IQR (CV): 112 (0.798) |
33 distinct values | 1980 (100%) |
0 (0%) |
|
| 3 | PROFTIME [Float64] |
Mean (sd): 14.7 (21.6) min ≤ med ≤ max: 0 ≤ 4 ≤ 95.9 IQR (CV): 23.8 (1.46) |
14 distinct values | 1980 (100%) |
0 (0%) |
|
| 4 | PROFDAY [Float64] |
Mean (sd): 3.91 (2.56) min ≤ med ≤ max: 0 ≤ 4 ≤ 9 IQR (CV): 5 (0.656) |
10 distinct values | 1980 (100%) |
0 (0%) |
|
| 5 | AMT [Float64] |
Mean (sd): 93.9 (308) min ≤ med ≤ max: 0 ≤ 0 ≤ 1600 IQR (CV): 0 (3.27) |
6 distinct values | 1980 (100%) |
0 (0%) |
|
| 6 | EVID [Int64] |
Mean (sd): 0.182 (0.386) min ≤ med ≤ max: 0 ≤ 0 ≤ 1 IQR (CV): 0 (2.12) |
2 distinct values | 1980 (100%) |
0 (0%) |
|
| 7 | CMT [String7?] |
1. Depot | 360 (100%) | 360 (18.2%) |
1620 (81.8%) |
|
| 8 | ROUTE [String3] |
1. ev | 1980 (100%) | 1980 (100%) |
0 (0%) |
|
| 9 | CONC [Float64?] |
Mean (sd): 9.28 (13.6) min ≤ med ≤ max: 0 ≤ 4.06 ≤ 110 IQR (CV): 10.1 (1.46) |
947 distinct values | 1620 (81.8%) |
360 (18.2%) |
|
| 10 | PD_CONC [Float64?] |
Mean (sd): 72.1 (29.4) min ≤ med ≤ max: 43.6 ≤ 58.8 ≤ 198 IQR (CV): 30.4 (0.407) |
1005 distinct values | 1140 (57.6%) |
840 (42.4%) |
|
| 11 | SEX [String7] |
1. Female 2. Male |
1056 (53.3%) 924 (46.7%) |
1980 (100%) |
0 (0%) |
|
| 12 | WEIGHTB [Float64] |
Mean (sd): 79.6 (5.89) min ≤ med ≤ max: 69 ≤ 78.2 ≤ 93.2 IQR (CV): 8.67 (0.0739) |
60 distinct values | 1980 (100%) |
0 (0%) |
|
| 13 | DOSE [Int64] |
Mean (sd): 517 (549) min ≤ med ≤ max: 0 ≤ 300 ≤ 1600 IQR (CV): 700 (1.06) |
6 distinct values | 1980 (100%) |
0 (0%) |
|
| 14 | TRTACT [String7] |
1. 200mg 2. Placebo 3. 400mg 4. 100mg 5. 1600mg 6. 800mg |
330 (16.7%) 330 (16.7%) 330 (16.7%) 330 (16.7%) 330 (16.7%) 330 (16.7%) |
1980 (100%) |
0 (0%) |
|
| Dimensions: 1980 x 14 Duplicate rows: 0 | ||||||
Comprehensive data quality overview showing column types, summary statistics, unique value counts, distribution visualizations, and missing data proportions for all variables in the analysis dataset.
As we can see, it provides:
This single function gives a comprehensive overview of the entire dataset, making it easy to spot issues like unexpected missing data, incorrect data types, or unusual value ranges.
For demographic and covariate information, we create stratified summaries. The table_one() function provides publication-ready summary tables stratified by grouping variables:
table_one(
unique(wide_data, :ID),
[:WEIGHTB => "Baseline Weight [kg]", :SEX => "Sex [Female/Male]"],
groupby=:SEX => "Sex",
show_n=true
)| Sex | |||
| Total (n=60) |
Female (n=32) |
Male (n=28) |
|
| Baseline Weight [kg] | |||
| Mean (SD) | 79.6 (5.93) | 75.3 (2.67) | 84.6 (4.65) |
| Median [Min, Max] | 78.2 [69, 93.2] | 75.8 [69, 79.7] | 84 [74.9, 93.2] |
| Sex [Female/Male] | |||
| Female | 32 (53.3%) | 32 (100%) | 0 (0%) |
| Male | 28 (46.7%) | 0 (0%) | 28 (100%) |
This produces a “Table 1” style summary commonly used in clinical publications, showing:
For more customized summary tables, including tables stratified by multiple variables, custom summary functions, or complex formatting, see the documentation for SummaryTables.jl package that offers extensive customization options for publication-quality tables.
Visual exploration of concentration-time profiles is essential for understanding PK characteristics including dose proportionality, accumulation, and between-subject variability.
We begin by visualizing the average concentration-time profile for each dose level. This helps identify overall trends and assess whether the study design captured key PK features (absorption, distribution, elimination).
First, we need to load the plotting packages:
using PumasUtilities
using AlgebraOfGraphics
using CairoMakieFor multiple ascending dose studies, comparing first dose versus steady-state profiles reveals drug accumulation patterns.
# Create mean concentration profile by dose and time
df_mean = @chain wide_data begin
@rsubset :DOSE != 0
@by [:PROFDAY, :TRTACT, :PROFTIME] begin
:Cmean = mean(:CONC)
:Cmedian = median(:CONC)
end
end
@rsubset! df_mean :PROFDAY ∈ [1, 6]
# Plot using AlgebraOfGraphics
dose_renamer = renamer(map(d -> d => "$(Int(d)) mg", unique(wide_data.DOSE)))
day_renamer = renamer(map(d -> d => "Day $(Int(d))", unique(df_mean.PROFDAY)))# First dosing interval (e.g., Day 1: 0-24 hours)
plt_by_day = draw(data(@rsubset(df_mean, !ismissing(:Cmean))) *
mapping(:PROFTIME, :Cmean; color=:TRTACT => "Dose",
layout=:PROFDAY => day_renamer) *
visual(Lines, linewidth=6),
axis=(;
xlabel="Time (hours)",
ylabel="Mean Concentration (ng/mL)",
yscale=log10,
ytickformat=x -> string.(round.(x; digits=1)),
yticks=[0.1, 1, 2, 4, 10, 20, 40, 80, 100],
ygridwidth=3,
yminorgridcolor=:lightgrey,
yminorticksvisible=true,
yminorgridvisible=true,
yminorticks=IntervalsBetween(10),
),
legend=(; position=:bottom),
figure=(; size=(1600, 800), fontsize=28)
)or one can change the visualization to observe different doses in each panel and the accumulation as a group
# First dosing interval (e.g., Day 1: 0-24 hours)
plt_by_dose = draw(data(@rsubset(df_mean, !ismissing(:Cmean))) *
mapping(:PROFTIME, :Cmean; color=:PROFDAY => day_renamer,
layout=:TRTACT => "Dose") *
visual(Lines, linewidth=6),
axis=(;
xlabel="Time (hours)",
ylabel="Mean Concentration (ng/mL)",
yscale=log10,
ytickformat=x -> string.(round.(x; digits=1)),
yticks=[0.1, 1, 2, 4, 10, 20, 40, 80, 100],
ygridwidth=3,
yminorgridcolor=:lightgrey,
yminorticksvisible=true,
yminorgridvisible=true,
yminorticks=IntervalsBetween(5),
),
legend=(; position=:bottom),
figure=(; size=(1600, 1200), fontsize=28)
)Comparing these two panels helps identify accumulation—if steady-state concentrations are substantially higher than first-dose concentrations, the drug is accumulating.
Spaghetti plots reveal between-subject variability by overlaying individual profiles (gray) with population mean (red).
# Add identifier for distinguishing data vs mean
df_mean = @rsubset df_mean !ismissing(:Cmean)
df_conc = @chain copy(wide_data) begin
@rsubset :DOSE != 0
@rsubset !ismissing(:CONC)
@rsubset :PROFDAY ∈ [1, 6]
end
df_conc.source .= "Individual"
df_mean.source .= "Mean"
# Create layers
individual_layer = data(df_conc) *
mapping(:PROFTIME, :CONC;
group=:ID => nonnumeric,
row=:PROFDAY => day_renamer,
col=:TRTACT => "Dose") *
visual(Lines; color=(:gray, 0.6), linewidth=0.7)
mean_layer = data(df_mean) *
mapping(:PROFTIME, :Cmean;
row=:PROFDAY => day_renamer,
col=:TRTACT => "Dose") *
visual(Lines; color=:red, linewidth=4)
plt_spgt_mean = draw(
individual_layer + mean_layer;
axis=(;
xlabel="Time (hours)",
ylabel="Concentration (ng/mL)",
yscale=log10,
ytickformat=x -> string.(round.(x; digits=1)),
yticks=[0.1, 1, 2, 4, 10, 20, 40, 60, 100],
ygridwidth=1,
yminorgridcolor=(:lightgrey, 0.5),
yminorticksvisible=true,
yminorgridvisible=true,
yminorticks=IntervalsBetween(2),
xticks=0:4:24,
limits=(nothing, 24, nothing, 100),
),
figure=(; size=(1600, 1200), fontsize=28),
)
# save("./figures/fig2-concvstime.png", plt_spgt_mean; px_per_unit = 2)
plt_spgt_meanThis plot clearly shows the variability across subjects (gray lines) and how the mean profile (red line) represents the central tendency.
For detailed examination of individual profiles, the observations_vs_time() function creates separate panels for each subject, useful for identifying outliers or unusual PK patterns.
using QuartoTools: Tabset
function ovst(pop)
observations_vs_time(
pop;
observations=[:CONC],
paginate=true,
separate=true,
figure=(; size=(1200, 900), fontsize=24),
axis=(;
xlabel="Time (hours)",
ylabel="Concentration (ng/mL)",
ygridwidth=1,
yminorgridcolor=(:lightgrey, 0.5),
yminorticksvisible=true,
yminorgridvisible=true,
yminorticks=IntervalsBetween(2),
),
legend=(; position=:bottom)
)
end
obs_plots = ovst(population)
# Display all using Tabset
Tabset([
"Panel $(id)" => plot for (id, plot) in enumerate(obs_plots)
])Setting paginate = true splits subjects across multiple figure pages, which is essential for datasets with many subjects (>9). Each page can be accessed via indexing (obs_plots[1], obs_plots[2], etc.).
While we demonstrate basic plotting the tutorials in tutorials.pumas.ai provide comprehensive examples using AlgebraOfGraphics.jl and CairoMakie.jl for:
Having characterized PK profiles, we now explore pharmacodynamic endpoints to understand exposure-response relationships, temporal dynamics, and patient variability in drug response.
In addition to characterizing PK profiles, exploratory analysis of pharmacodynamic endpoints is essential for understanding the exposure-response relationship. The wide_data DataFrame already contains matched PK (:CONC) and PD (:PD_CONC) observations at the same timepoints, making it straightforward to explore:
First, let’s create a summary DataFrame for mean PD profiles by dose and time:
# Calculate mean PD metrics by dose and time
df_pd_mean = @chain wide_data begin
@rsubset :NOMTIME >= 0
@rsubset :PROFDAY >= 6
@rsubset !ismissing(:PD_CONC) && :EVID == 0
@by [:DOSE, :PROFDAY, :PROFTIME] begin
:PD_mean = mean(:PD_CONC)
:PD_median = median(:PD_CONC)
:PD_std = std(:PD_CONC)
:PD_n = length(:PD_CONC)
:PD_sem = std(:PD_CONC) / sqrt(length(:PD_CONC))
:PD_time = median(:PROFTIME)
end
@rtransform begin
:lci = :PD_mean - 1.96 * (:PD_sem)
:uci = :PD_mean + 1.96 * (:PD_sem)
end
endNow we can visualize the mean PD time-course profiles colored by dose:
# Mean PD profiles with 95% CI ribbons by dose (Day 6)
plt_pd_time_mean = data(df_pd_mean) *
mapping(
:PD_time => "Time (Days)",
:PD_mean => "Mean PD Response (IU/L)";
color=:DOSE => dose_renamer => "Dose",
) *
(visual(Lines, linewidth=4))
plt_pd_time_ci = data(df_pd_mean) *
mapping(:PD_time => "Time (Days)",
# :PD_mean => "Mean PD Response (IU/L)",
:lci => "Lower CI",
:uci => "Upper CI",
color=:DOSE => dose_renamer => "Dose") *
visual(Band, alpha=0.4);
draw(plt_pd_time_mean + plt_pd_time_ci;
axis=(;
xlabel="Time (Days)",
ylabel="Mean PD Response (IU/L)",
),
legend=(; position=:bottom),
figure=(; size=(1200, 700), fontsize=24)
)These plots reveal the overall shape of the PD time-course and help identify:
Spaghetti plots reveal between-subject variability in PD response magnitude and time-course.
# Prepare data for individual PD plots
df_pd_individual = @chain wide_data begin
@rsubset begin
!ismissing(:PD_CONC)
:EVID == 0
:PROFDAY >= 6
:NOMTIME >= 0
end
dropmissing(:PD_CONC)
end
# Individual layers
individual_pd_layer = data(df_pd_individual) *
mapping(
:PROFTIME => "Time (hours)",
:PD_CONC => "PD Response (IU/L)";
group=:ID => nonnumeric,
# row = :PROFDAY => nonnumeric,# => day_renamer, # with this the plots are empty
layout=:DOSE => dose_renamer => "Dose"
) *
visual(Lines; color=(:gray, 1.0), linewidth=0.7)
# Mean overlay
mean_pd_layer = data(df_pd_mean) *
mapping(
:PD_time => "Time (hours)",
:PD_mean => "PD Response (IU/L)";
# row = :PROFDAY => nonnumeric,# => day_renamer,# with this the plots are empty
layout=:DOSE => dose_renamer => "Dose"
) *
visual(Lines; color=:red, linewidth=4)
draw(
individual_pd_layer + mean_pd_layer;
axis=(;
xlabel="Time (hours)",
ylabel="PD Response (IU/L)",
xticks=0:48:196,
),
figure=(; title="Day 6 PD profiles", size=(1600, 1200), fontsize=28)
)The spaghetti plots clearly show:
A key pharmacometric question is: How does PD response relate to drug concentration? Since wide_data already has matched PK and PD observations, we can directly explore this relationship.
At Steady State (Day 6):
# Filter to steady state observations with both PK and PD
df_er_ss = @chain wide_data begin
@rsubset begin
# :PROFDAY > 5
:NOMTIME >= 0
!ismissing(:CONC) && !ismissing(:PD_CONC)
:EVID == 0
:CONC > 0 # Exclude pre-dose samples
end
end
dose_renamer = renamer(map(d -> d => "$(Int(d)) mg", unique(df_er_ss.DOSE)))
# Scatter plot with dose coloring
plt_er_ss = draw(
data(df_er_ss) *
mapping(
:CONC => "PK Concentration (ng/mL)",
:PD_CONC => "PD Response (IU/L)";
color=:DOSE => dose_renamer => "Dose"
) *
visual(Scatter, markersize=12, alpha=0.6);
axis=(;
xlabel="PK Concentration (ng/mL)",
ylabel="PD Response (IU/L)",
xscale=log10,
xtickformat=x -> string.(round.(x; digits=1)),
xticks=[1, 2, 5, 10, 20, 50, 100],
),
legend=(; position=:bottom),
figure=(; size=(1000, 800), fontsize=24)
)This plot reveals:
Binned Concentration-Response Analysis:
Binning concentrations and calculating mean PD response clarifies the central tendency obscured by individual variability.
# Create concentration bins and calculate mean PD
using Statistics
# Define bins on log scale
conc_bins = [0, 2, 5, 10, 20, 50, 100, 200]
df_er_ss_binned = @chain df_er_ss begin
@rtransform :CONC_bin = findfirst(x -> :CONC < x, conc_bins[2:end])
@by :CONC_bin begin
:CONC_mean = mean(:CONC)
:PD_mean = mean(:PD_CONC)
:PD_sem = std(:PD_CONC) / sqrt(length(:PD_CONC))
:n = length(:PD_CONC)
end
@rsubset !ismissing(:CONC_bin)
end
# Add binned statistics to the plot
plt_er_ss_binned = draw(
data(df_er_ss) *
mapping(
:CONC => "PK Concentration (ng/mL)",
:PD_CONC => "PD Response (IU/L)";
color=:DOSE => nonnumeric,# => dose_renamer => "Dose"
) *
visual(Scatter, markersize=10, alpha=0.4) +
data(df_er_ss_binned) *
mapping(
:CONC_mean => "PK Concentration (ng/mL)",
:PD_mean => "PD Response (IU/L)"
) *
visual(Scatter, color=:black, marker=:rect, markersize=16) +
data(df_er_ss_binned) *
mapping(
:CONC_mean => "PK Concentration (ng/mL)",
:PD_mean => "PD Response (IU/L)",
:PD_sem => "Standard Error"
) *
visual(Errorbars, color=:black, linewidth=3, whiskerwidth=5);
axis=(;
xlabel="PK Concentration (ng/mL)",
ylabel="PD Response (IU/L)",
xscale=log10,
xtickformat=x -> string.(round.(x; digits=1)),
xticks=[1, 2, 5, 10, 20, 50, 100],
),
legend=(; position=:bottom),
figure=(; size=(1000, 800), fontsize=24)
)
# save("./figures/fig5-errelation.png", plt_er_ss_binned; px_per_unit = 2)
plt_er_ss_binnedTemporal Stability of Exposure-Response:
Faceting by study day reveals whether the exposure-response relationship changes over time (tolerance or sensitization).
# Filter to all days with PK and PD observations
df_er_time = @chain wide_data begin
@rsubset begin
!ismissing(:CONC) && !ismissing(:PD_CONC)
:EVID == 0
:CONC > 0
end
end
day_renamer_pd = renamer(map(d -> d => "Day $(Int(d))", unique(df_er_time.PROFDAY)))
plt_er_by_day = draw(
data(df_er_time) *
mapping(
:CONC => "PK Concentration (ng/mL)",
:PD_CONC => "PD Response (IU/L)";
color=:DOSE => dose_renamer => "Dose",
layout=:PROFDAY => day_renamer_pd => "Study Day"
) *
visual(Scatter, markersize=10, alpha=0.6);
axis=(;
xlabel="PK Concentration (ng/mL)",
ylabel="PD Response (IU/L)",
xscale=log10,
xtickformat=x -> string.(round.(x; digits=1)),
),
legend=(; position=:bottom),
figure=(; size=(1400, 600), fontsize=24)
)This faceted view helps identify:
Hysteresis plots reveal temporal delays between concentration and effect by plotting PD versus PK as time-ordered trajectories.
Interpreting Hysteresis Patterns:
# Prepare data for hysteresis plots
df_hysteresis = @chain wide_data begin
@rsubset begin
!ismissing(:CONC) && !ismissing(:PD_CONC)
:EVID == 0
# :PROFDAY == 1 # Focus on steady state
:DOSE > 0
end
@orderby :ID :NOMTIME # Ensure temporal ordering
end
# Create subject-specific trajectories with arrows
# Note: For simplicity, we'll show a subset of subjects
subject_subset = unique(df_hysteresis.ID)[1:20]
plt_hysteresis = draw(
data(@rsubset(df_hysteresis, :ID ∈ subject_subset)) *
mapping(
:CONC => "PK Concentration (ng/mL)",
:PD_CONC => "PD Response (IU/L)";
color=:NOMTIME => "Time (hours)",
layout=:ID => renamer(map(d -> d => "ID: $(d)", subject_subset)),
) *
visual(Lines, linewidth=2);
axis=(;
xlabel="PK Concentration (ng/mL)",
ylabel="PD Response (IU/L)",
xscale=log10,
xtickformat=x -> string.(round.(x; digits=1)),
),
figure=(; size=(2000, 2000), fontsize=18)
)We investigate whether demographic characteristics modify the exposure-response relationship, which would inform covariate inclusion in PD modeling.
# Exposure-response stratified by sex
df_er_sex = @chain wide_data begin
@rsubset begin
:PROFDAY > 1
!ismissing(:CONC) && !ismissing(:PD_CONC)
:EVID == 0
:CONC > 0
:DOSE > 0
end
end
plt_er_by_sex = draw(
data(df_er_sex) *
mapping(
:CONC => "PK Concentration (ng/mL)",
:PD_CONC => "PD Response (IU/L)";
color=:SEX => "Sex",
col=:SEX => "Sex"
) *
visual(Scatter, markersize=12, alpha=0.6);
axis=(;
xlabel="PK Concentration (ng/mL)",
ylabel="PD Response (IU/L)",
xscale=log10,
xtickformat=x -> string.(round.(x; digits=1)),
),
legend=(; position=:bottom),
figure=(; size=(1400, 600), fontsize=24)
)This analysis helps determine:
Summary: Exploratory Data Analysis
In this section, we completed comprehensive visualization of PK and PD data:
In this section, we’ve completed Stages 1-2 of the workflow map:
Stage 1: Data Preparation
CSV.read()population / population_placebo: Population objects using read_pumas() for NLME modelingStage 2: Exploratory Data Analysis
overview_table() and table_one()Having explored the data visually and confirmed its quality, we begin quantitative analysis with non-compartmental methods. NCA provides model-independent exposure metrics that inform dose selection and serve as initial parameter estimates for subsequent population PK modeling.
Non-compartmental analysis estimates PK parameters without assuming a specific compartmental structure. Instead, it uses numerical integration (trapezoidal rule) and statistical moments to derive exposure metrics. Key advantages of NCA include:
Common NCA parameters include:
For multiple-dose studies, we also assess:
In the previous section, we already covered the the creation of a NCAPopulation and used that object to understand basic properties of the data such as profiles over day and across doses, extent of variability by visualizing spaghetti plots and finally accumulation from first dose to steady state via the group profiles over days across different doses. One thing that was noticed from the plots in the previous section was that rich concentration time profiles are only available on days 1 and 6 and the remaining days have only one sample. NCA is desirable over full profiles and hence we created an NCAPopulation using only PROFDAYs 1 and 6.
Next we directly jump into computation of the NCA parameters.
To calculate all standard NCA parameters, use run_nca():
nca_report = run_nca(population_nca)This returns an NCAReport object containing:
reportdf: DataFrame with all calculated parameters for each subjectnca_report.auc, nca_report.cmax, etc.nca_report_df = DataFrame(nca_report.reportdf)
first(nca_report_df, 10)| Row | id | PROFDAY | DOSE | dose | tlag | tmax | cmax | tlast | clast | clast_pred | auclast | kel | half_life | aucinf_obs | aucinf_pred | vz_f_obs | cl_f_obs | vz_f_pred | cl_f_pred | n_samples | cmax_dn | auclast_dn | aucinf_dn_obs | auc_extrap_obs | aucinf_dn_pred | auc_extrap_pred | aumclast | aumcinf_obs | aumc_extrap_obs | aumcinf_pred | aumc_extrap_pred | n_samples_kel | rsq_kel | rsq_adj_kel | corr_kel | intercept_kel | kel_t_low | kel_t_high | span | route |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| String | Float64 | Int64 | Float64 | Missing | Float64 | Float64 | Float64 | Float64 | Float64? | Float64 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Int64 | Float64 | Float64 | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Float64? | Float64? | Float64? | Int64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | String | |
| 1 | 100_1 | 1.0 | 100 | 100.0 | missing | 1.0 | 2.65 | 23.9 | 0.57 | 0.559479 | 21.2625 | 0.0263839 | 26.2716 | 42.8666 | 42.4678 | 88.4183 | 2.33282 | 89.2486 | 2.35472 | 9 | 0.0265 | 0.212625 | 0.428666 | 50.3985 | 0.424678 | 49.9327 | 195.892 | 1531.07 | 87.2055 | 1506.42 | 86.9962 | 3 | 0.960105 | 0.920211 | 0.97985 | 0.049825 | 12.0 | 23.9 | 0.45296 | EV |
| 2 | 100_10 | 1.0 | 100 | 100.0 | missing | 1.0 | 2.45 | 23.9 | 1.35 | 1.16566 | 35.773 | 0.0228455 | 30.3406 | 94.8656 | 86.7966 | 46.1414 | 1.05412 | 50.4309 | 1.15212 | 9 | 0.0245 | 0.35773 | 0.948656 | 62.2909 | 0.867966 | 58.7852 | 387.303 | 4386.23 | 91.17 | 3840.19 | 89.9145 | 6 | 0.59836 | 0.49795 | 0.773537 | 0.699294 | 2.0 | 23.9 | 0.721804 | EV |
| 3 | 100_2 | 1.0 | 100 | 100.0 | missing | 1.0 | 1.86 | 23.9 | 0.51 | 0.385273 | 18.7815 | 0.056208 | 12.3318 | 27.8549 | 25.6359 | 63.8704 | 3.59003 | 69.399 | 3.90078 | 9 | 0.0186 | 0.187815 | 0.278549 | 32.5739 | 0.256359 | 26.7376 | 166.798 | 545.08 | 69.3993 | 452.566 | 63.1439 | 6 | 0.809214 | 0.761517 | 0.899563 | 0.389567 | 2.0 | 23.9 | 1.77589 | EV |
| 4 | 100_3 | 1.0 | 100 | 100.0 | missing | 2.0 | 1.89 | 23.9 | 0.69 | 0.653744 | 23.674 | 0.0229665 | 30.1808 | 53.7178 | 52.1392 | 81.0565 | 1.86158 | 83.5107 | 1.91794 | 9 | 0.0189 | 0.23674 | 0.537178 | 55.929 | 0.521392 | 54.5946 | 230.034 | 2256.24 | 89.8045 | 2149.77 | 89.2996 | 4 | 0.853196 | 0.779793 | 0.923686 | 0.123859 | 8.0 | 23.9 | 0.526824 | EV |
| 5 | 100_4 | 1.0 | 100 | 100.0 | missing | 0.5 | 3.03 | 23.9 | 0.64 | 0.565717 | 23.898 | 0.0346612 | 19.9978 | 42.3625 | 40.2194 | 68.1044 | 2.36058 | 71.7334 | 2.48637 | 9 | 0.0303 | 0.23898 | 0.423625 | 43.5868 | 0.402194 | 40.5808 | 213.163 | 1187.18 | 82.0446 | 1074.13 | 80.1548 | 5 | 0.833651 | 0.778201 | 0.913045 | 0.258742 | 4.0 | 23.9 | 0.99511 | EV |
| 6 | 100_5 | 1.0 | 100 | 100.0 | missing | 1.0 | 2.13 | 23.9 | 0.71 | 0.703949 | 23.896 | 0.0129469 | 53.5378 | 78.7355 | 78.2682 | 98.099 | 1.27007 | 98.6847 | 1.27766 | 9 | 0.0213 | 0.23896 | 0.787355 | 69.6503 | 0.782682 | 69.4691 | 232.416 | 5778.82 | 95.9781 | 5731.55 | 95.945 | 4 | 0.988631 | 0.982946 | 0.994299 | -0.0416186 | 8.0 | 23.9 | 0.296986 | EV |
| 7 | 100_6 | 1.0 | 100 | 100.0 | missing | 2.0 | 1.01 | 23.9 | 0.43 | 0.355733 | 12.577 | 0.0293419 | 23.6231 | 27.2318 | 24.7007 | 125.151 | 3.67218 | 137.975 | 4.04847 | 9 | 0.0101 | 0.12577 | 0.272318 | 53.815 | 0.247007 | 49.0824 | 125.045 | 974.744 | 87.1715 | 827.989 | 84.8977 | 5 | 0.593508 | 0.458011 | 0.770395 | -0.332302 | 4.0 | 23.9 | 0.842396 | EV |
| 8 | 100_7 | 1.0 | 100 | 100.0 | missing | 1.0 | 2.12 | 23.9 | 0.62 | 0.598411 | 22.235 | 0.028518 | 24.3056 | 43.9756 | 43.2186 | 79.7386 | 2.27399 | 81.1353 | 2.31382 | 9 | 0.0212 | 0.22235 | 0.439756 | 49.4379 | 0.432186 | 48.5522 | 214.385 | 1496.33 | 85.6727 | 1451.69 | 85.2321 | 3 | 0.885965 | 0.77193 | 0.941257 | 0.168103 | 12.0 | 23.9 | 0.489599 | EV |
| 9 | 100_8 | 1.0 | 100 | 100.0 | missing | 0.5 | 3.45 | 23.9 | 0.35 | 0.336418 | 16.8565 | 0.0273938 | 25.3031 | 29.6331 | 29.1373 | 123.189 | 3.3746 | 125.285 | 3.43203 | 9 | 0.0345 | 0.168565 | 0.296331 | 43.116 | 0.291373 | 42.148 | 125.737 | 897.505 | 85.9903 | 867.555 | 85.5067 | 5 | 0.967016 | 0.956022 | 0.98337 | -0.43469 | 4.0 | 23.9 | 0.786465 | EV |
| 10 | 100_9 | 1.0 | 100 | 100.0 | missing | 0.5 | 2.04 | 23.9 | 1.01 | 0.802494 | 26.404 | 0.0303423 | 22.8443 | 59.6909 | 52.852 | 55.2133 | 1.6753 | 62.3577 | 1.89208 | 9 | 0.0204 | 0.26404 | 0.596909 | 55.7654 | 0.52852 | 50.0416 | 279.781 | 2172.38 | 87.121 | 1783.54 | 84.3132 | 7 | 0.578239 | 0.493887 | 0.76042 | 0.50515 | 1.0 | 23.9 | 1.00244 | EV |
Common parameters in the report include:
lambdaz: Terminal elimination rate constanthalf_life: Terminal half-life (t½)tmax: Time of maximum concentrationcmax: Maximum concentrationauclast: AUC from time zero to last measurable concentrationaucinf_obs: AUC from time zero to infinity (observed)vz_f_obs: Apparent volume of distribution (Vz/F)cl_f_obs: Apparent clearance (CL/F)You can specify which parameters to calculate:
nca_report_custom = run_nca(
population_nca;
parameters=["aucinf_obs", "auclast", "cmax", "tmax", "half_life"]
)See the Pumas NCA documentation for the full list of available parameters.
Dose proportionality analysis evaluates whether exposure increases linearly with dose. We use dose-normalized AUC and Cmax as primary metrics.
If PK is dose-proportional, dose-normalized AUC and Cmax should be independent of dose (i.e., constant across dose levels).
using AlgebraOfGraphics, CairoMakie
# Extract AUC and Cmax from NCA report, removing missing values
nca_summary = @select nca_report_df :id :PROFDAY :DOSE :cmax_dn :aucinf_dn_obs
dropmissing!(nca_summary)
fig_dose_prop = Figure(; size=(800, 800))
day_renamer = renamer(map(d -> d => "Day $(Int(d))", unique(nca_summary.PROFDAY)))
# AUC vs Dose
plt_auc_dose = data(nca_summary) *
mapping(:DOSE => nonnumeric, :aucinf_dn_obs;
layout=:PROFDAY => day_renamer) *
visual(Violin; show_median=true)
draw!(
fig_dose_prop[1, 1],
plt_auc_dose;
axis=(;
xlabel="Dose (mg)",
ylabel="Dose-Normalized AUC₀₋∞ (ng·hr/mL/mg)",
#title = "AUC Dose Proportionality",
),
)
# Cmax vs Dose
plt_cmax_dose = data(nca_summary) *
mapping(:DOSE => nonnumeric, :cmax_dn;
layout=:PROFDAY => day_renamer) *
visual(Violin; show_median=true)
draw!(
fig_dose_prop[2, 1],
plt_cmax_dose;
axis=(;
xlabel="Dose (mg)",
ylabel="Dose-Normalized Cmax (ng/mL/mg)",
# title = "Cmax Dose Proportionality",
),
)
fig_dose_propInterpretation:
For formal statistical testing, the power model is commonly used:
log(Parameter) ~ log(α) + β * log(Dose)
Where:
This analysis is typically performed using linear regression on log-transformed data. Below is the example just on cmax.
dp_cmax = NCA.DoseLinearityPowerModel(nca_report, :cmax)Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
Estimate low CI 95% high CI 95%
────────────────────────────────────
β 1.01617 0.921403 1.11094
────────────────────────────────────
Here’s a visualization for the dose linearity using a power model for cmax
In this section, we demonstrated NCA workflows for multiple ascending dose studies:
NCA Population Creation (ⓐ)
read_nca() to create NCAPopulation objectsDose-Normalized Concentration Analysis
Parameter Calculation (ⓐ)
run_nca() for comprehensive NCA reportsDose Proportionality Assessment
NCA provides rapid, model-independent PK characterization and can inform:
Having characterized exposure metrics via NCA, we now build mechanistic compartmental models to describe concentration-time data. This section demonstrates the complete model development workflow: base model fitting, diagnostics, model comparison, random effect refinement, covariate selection, and parameter uncertainty quantification.
This section covers multiple workflow stages:
Each stage builds on the previous, demonstrating iterative model development.
oral_model = @model begin
@metadata begin
desc = "One-compartment oral absorption model"
timeu = u"hr" # hour
end
@param begin
θka ∈ RealDomain(lower=0.1)
θcl ∈ RealDomain(lower=0.1)
θvc ∈ RealDomain(lower=1.0, upper=500)
Ω_pk ∈ PDiagDomain(3)
σ_add_pk ∈ RealDomain(lower=0.0)
σ_prop_pk ∈ RealDomain(lower=0.0)
end
@random begin
η ~ MvNormal(Ω_pk)
end
@pre begin
ηka, ηcl, ηvc = η
Ka = θka * exp(ηka)
CL = θcl * exp(ηcl)
Vc = θvc * exp(ηvc)
end
@dynamics Depots1Central1
@derived begin
ipred := @. Central / Vc
CONC ~ @. Normal(ipred, sqrt(σ_add_pk^2 + (ipred * σ_prop_pk)^2))
end
endPumasModel
Parameters: θka, θcl, θvc, Ω_pk, σ_add_pk, σ_prop_pk
Random effects: η
Covariates:
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: CONC
Observed: CONC
We use NCA-derived values to inform initial estimates for clearance and volume parameters.
initial_est_oral_model = (
θka=0.9,
θcl=4.3,
θvc=252.0,
Ω_pk=Diagonal([0.1, 0.1, 0.1]),
σ_add_pk=sqrt(0.09),
σ_prop_pk=sqrt(0.01),
)(θka = 0.9,
θcl = 4.3,
θvc = 252.0,
Ω_pk = [0.1 0.0 0.0; 0.0 0.1 0.0; 0.0 0.0 0.1],
σ_add_pk = 0.3,
σ_prop_pk = 0.1,)
Before fitting, we validate that initial parameters yield reasonable log-likelihood and that no subjects are highly influential.
foce_loglikelihood = loglikelihood(oral_model, population, initial_est_oral_model, FOCE())-5311.612884011242
foce_fi = findinfluential(oral_model, population, initial_est_oral_model, FOCE())50-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "1600_8", nll = 266.42908266386814)
(id = "1600_2", nll = 228.20685400233194)
(id = "1600_5", nll = 225.38105794476454)
(id = "1600_1", nll = 203.29683299466527)
(id = "800_6", nll = 202.6465624934891)
(id = "800_1", nll = 189.5454835861801)
(id = "800_7", nll = 188.6610414786809)
(id = "1600_9", nll = 184.77508968321845)
(id = "800_2", nll = 176.5944189955281)
(id = "1600_6", nll = 169.93898812529602)
⋮
(id = "200_2", nll = 43.37195275343636)
(id = "200_10", nll = 40.68007340784086)
(id = "100_3", nll = 38.53032098056971)
(id = "100_5", nll = 38.37424787341113)
(id = "200_5", nll = 36.651641107865856)
(id = "100_7", nll = 32.69390797435821)
(id = "100_2", nll = 29.7365955162019)
(id = "200_1", nll = 29.23535715676298)
(id = "100_6", nll = 7.419946574114995)
Simulate from the model using initial parameters to verify they generate plausible concentration profiles before fitting.
pre_fit_sim = simobs(oral_model, population, initial_est_oral_model; rng)Simulated population (Vector{<:Subject})
Simulated subjects: 50
Simulated variables: CONC
using PumasUtilities
using CairoMakie
sim_plot(pre_fit_sim,
color=(:grey, 0.5),
linewidth=0.5,
markercolor=(:black, 0.5),
axis=(yscale=Makie.pseudolog10,
xlabel="Time (hours)",
ylabel="Mean Concentration (ng/mL)",
xticks=0:24:240),
legend=(; position=:bottom))We use a two-stage approach: first fit using NaivePooled (fast) to get stable fixed effects, then refine with FOCE for full population model.
np_fit = fit(
oral_model,
population,
(; initial_est_oral_model..., Ω_pk=Diagonal([0.0, 0.0, 0.0])),
NaivePooled();
constantcoef=(:Ω_pk,),
optim_options=(; show_trace=false),
verbose=false,
)┌ Warning: Observation PD_CONC is not modeled by the model └ @ Pumas none:4570
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 50
Observation records: Active Missing
CONC: 1350 0
PD_CONC: 950 400
Total: 2300 400
Number of parameters: Constant Optimized
1 7
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: -3099.9729
------------------------
Estimate
------------------------
θka 6.7245
θcl 1.3155
θvc 85.528
† Ω_pk₁,₁ 0.0
† Ω_pk₂,₂ 0.0
† Ω_pk₃,₃ 0.0
σ_add_pk 0.015187
σ_prop_pk 0.44841
------------------------
† indicates constant parameters
foce_fit = fit(
oral_model,
population,
(; coef(np_fit)..., Ω_pk=initial_est_oral_model.Ω_pk),
FOCE();
optim_options=(; show_trace=false),
verbose=false,
)┌ Warning: Observation PD_CONC is not modeled by the model └ @ Pumas none:4570
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 50
Observation records: Active Missing
CONC: 1350 0
PD_CONC: 950 400
Total: 2300 400
Number of parameters: Constant Optimized
0 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: -2589.0529
-----------------------
Estimate
-----------------------
θka 5.7837
θcl 1.3839
θvc 82.823
Ω_pk₁,₁ 1.2789e-7
Ω_pk₂,₂ 0.093668
Ω_pk₃,₃ 0.086028
σ_add_pk 0.015318
σ_prop_pk 0.28534
-----------------------
With the base model fitted, we compute comprehensive diagnostics to assess goodness-of-fit, identify systematic biases, and evaluate model adequacy.
foce_inspect = inspect(foce_fit)The resulting PumasInspect object contains:
Pumas provides several pre-defined diagnostic plots covering the most common goodness-of-fit evaluations.
goodness_of_fit(foce_inspect;
observations=[:CONC],
markercolor=(:grey, 0.5),
legend=(
position=:bottom,
framevisible=false,
labelsize=18,
patchsize=(20, 10),
),
figure=(size=(800, 800), fontsize=18))This plot shows:
From these plots, we can assess bias, heteroscedasticity, and systematic trends in residuals.
For detailed examination of specific diagnostic aspects, individual plotting functions provide focused views.
pk_diagnostics_kwargs = (observations=[:CONC],
markercolor=(:grey, 0.5),
legend=(
position=:bottom,
framevisible=false,
labelsize=13,
patchsize=(20, 10),))observations_vs_predictions(foce_inspect; pk_diagnostics_kwargs...)observations_vs_ipredictions(foce_inspect; pk_diagnostics_kwargs...)wresiduals_vs_time(foce_inspect; pk_diagnostics_kwargs...)Each function produces focused plots for specific diagnostics. Additional diagnostic functions like wresiduals_vs_predictions() and observations_vs_ipredictions() are detailed in Appendix XXX[C]XXX.
Visualize model fit for each subject to identify outliers and assess individual-level performance.
all_subject_fits = subject_fits(foce_inspect; pk_diagnostics_kwargs..., paginate = true, separate = true,)
all_subject_fits[4]This generates individual concentration-time plots overlaying observed data with individual predictions. The paginate = true option splits subjects across multiple pages, and separate = true gives each subject its own panel.
VPC is a simulation-based diagnostic comparing observed data quantiles to model-predicted quantile intervals. Agreement indicates adequate model predictive performance.
foce_vpc = vpc(foce_fit; covariates = [:tad])[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 50
Stratification variable(s): None
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
The covariates = [:tad] argument bins observations by time after dose. The vpc() function simulates many replicates of the dataset and computes prediction intervals for the 5th, 50th, and 95th percentiles.
plt_vpc = vpc_plot(foce_vpc,
observations=true,
markercolor=:grey,
observed_linewidth=4,
figurelegend=(position=:b,
alignmode=Outside(),
orientation=:horizontal,
nbanks=3),
markersize=12,
axis=(yscale=Makie.pseudolog10,
xticks = 0:8:120,
yticks = 0:20:120,
ylabel="Concentration (ng/mL)",
xlabel="Time (hours)",
spinewidth=4,
rightspinevisible=false,
topspinevisible=false),
figure=(size=(1800, 1400), fontsize=40))
plt_vpcThe plot shows observed quantiles (lines/points) overlaid on simulated quantile prediction intervals (shaded regions). Deviations between observed and simulated quantiles indicate areas where the model may not adequately describe the data.
Diagnostic plots from the one-compartment model reveal systematic misfit in the distribution phase (early time points show under-prediction). This motivates exploration of a two-compartment structure.
The two-compartment model adds a peripheral compartment, requiring two additional parameters: inter-compartmental clearance (Q) and peripheral volume (Vp). The closed-form solution is Depots1Central1Periph1.
oral_model_2cmt = @model begin
@metadata begin
desc = "Two-compartment oral absorption model"
timeu = u"hr"
end
@param begin
θka ∈ RealDomain(lower = 0.01)
θcl ∈ RealDomain(lower = 0.01)
θvc ∈ RealDomain(lower = 0.01, upper = 190.0)
θq ∈ RealDomain(lower = 0.01, upper = 90.0)
θvp ∈ RealDomain(lower = 0.01, upper = 190.0)
Ω_pk ∈ PDiagDomain(5)
σ_add_pk ∈ RealDomain(lower = 0.0)
σ_prop_pk ∈ RealDomain(lower = 0.0)
end
@random begin
η ~ MvNormal(Ω_pk)
end
@pre begin
ηka, ηcl, ηvc, ηq, ηvp = η
Ka = θka * exp(ηka)
CL = θcl * exp(ηcl)
Vc = θvc * exp(ηvc)
Q = θq * exp(ηq)
Vp = θvp * exp(ηvp)
end
@dynamics Depots1Central1Periph1
@derived begin
ipred := @. Central / Vc
CONC ~ @. Normal(ipred, sqrt(σ_add_pk^2 + (ipred * σ_prop_pk)^2))
end
endPumasModel
Parameters: θka, θcl, θvc, θq, θvp, Ω_pk, σ_add_pk, σ_prop_pk
Random effects: η
Covariates:
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: CONC
Observed: CONC
We initialize parameters using one-compartment estimates where applicable and add reasonable starting values for the new Q and Vp parameters.
initial_est_oral_model_2cmt = (
coef(foce_fit)...,
θq = 0.9,
θvp = 0.9,
Ω_pk = Diagonal([0.1, 0.1, 0.1, 0.1, 0.1]),
)
foce_loglikelihood = loglikelihood(
oral_model_2cmt,
population,
initial_est_oral_model_2cmt,
FOCE()
)-2589.1976148451167
Fit using FOCE and existing parameters set to previous fitted values:
foce_fit_2cmt = fit(
oral_model_2cmt,
population,
initial_est_oral_model_2cmt,
FOCE();
optim_options = (; show_trace = false),
verbose=false,
)Compute diagnostics and compare to one-compartment model to assess improvement.
foce_inspect_2cmt = inspect(foce_fit_2cmt)fig3_std2cmpt_gof = goodness_of_fit(foce_inspect_2cmt;
observations = [:CONC],
markercolor = (:grey, 0.5),
legend = (position = :bottom,
framevisible = false,
labelsize = 18,
patchsize = (20, 10),
),
figure = (size = (800, 800), fontsize = 18)
)
# save("./figures/fig3-std2cmpt-gof.png", fig3_std2cmpt_gof; px_per_unit = 2)
fig3_std2cmpt_gofThe diagnostics should show improved fit compared to the one-compartment model, particularly in the distribution phase.
Formal model comparison using parameter estimates and information criteria.
compare_estimates(one_compartment = foce_fit, two_compartment = foce_fit_2cmt)| Row | parameter | one_compartment | two_compartment |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | θka | 5.78368 | 1.8091 |
| 2 | θcl | 1.38387 | 1.3014 |
| 3 | θvc | 82.8229 | 33.9426 |
| 4 | Ω_pk₁,₁ | 1.27888e-7 | 0.00122221 |
| 5 | Ω_pk₂,₂ | 0.0936679 | 0.17496 |
| 6 | Ω_pk₃,₃ | 0.0860281 | 0.202418 |
| 7 | σ_add_pk | 0.0153176 | 0.0150842 |
| 8 | σ_prop_pk | 0.285343 | 0.0980985 |
| 9 | θq | missing | 11.8143 |
| 10 | θvp | missing | 77.7894 |
| 11 | Ω_pk₄,₄ | missing | 0.0290994 |
| 12 | Ω_pk₅,₅ | missing | 0.09743 |
and the metrics_table() function can be used to compare model performance metrics such as AIC and BIC.
@chain begin
leftjoin(metrics_table(foce_fit),
metrics_table(foce_fit_2cmt), on = :Metric, makeunique=true)
rename(:Value => :One_Compartment,
:Value_1 => :Two_Compartment)
end| Row | Metric | One_Compartment | Two_Compartment |
|---|---|---|---|
| String | Any | Any | |
| 1 | Successful | true | true |
| 2 | Estimation Time | 2.028 | 7.092 |
| 3 | Subjects | 50 | 50 |
| 4 | Fixed Parameters | 0 | 0 |
| 5 | Optimized Parameters | 8 | 12 |
| 6 | CONC Active Observations | 1350 | 1350 |
| 7 | CONC Missing Observations | 0 | 0 |
| 8 | PD_CONC Active Observations | 950 | 950 |
| 9 | PD_CONC Missing Observations | 400 | 400 |
| 10 | Total Active Observations | 2300 | 2300 |
| 11 | Total Missing Observations | 400 | 400 |
| 12 | Likelihood Approximation | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} |
| 13 | LogLikelihood (LL) | -2589.05 | -1351.96 |
| 14 | -2LL | 5178.11 | 2703.92 |
| 15 | AIC | 5194.11 | 2727.92 |
| 16 | BIC | 5240.03 | 2796.81 |
| 17 | (η-shrinkage) η₁ | 0.999 | 0.824 |
| 18 | (η-shrinkage) η₂ | 0.019 | -0.0 |
| 19 | (η-shrinkage) η₃ | 0.03 | 0.001 |
| 20 | (ϵ-shrinkage) CONC | 0.031 | 0.066 |
foce_vpc_2cmt = vpc(foce_fit_2cmt;
covariates = [:tad],
ensemblealg = EnsembleThreads())[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 50
Stratification variable(s): None
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
plt_vpc_2cmt = vpc_plot(foce_vpc_2cmt,
observations=true,
markercolor=:grey,
observed_linewidth=4,
figurelegend=(position=:b,
alignmode=Outside(),
orientation=:horizontal,
nbanks=3),
markersize=12,
axis=(yscale=Makie.pseudolog10,
xticks = 0:8:120,
yticks = 0:20:120,
ylabel="Concentration (ng/mL)",
xlabel="Time (hours)",
spinewidth=4,
rightspinevisible=false,
topspinevisible=false),
figure=(size=(1800, 1400), fontsize=40))
# save("./figures/fig4-std2cmpt-vpc.png", plt_vpc_2cmt; px_per_unit = 2)
plt_vpc_2cmtWith satisfactory structural model identified (two-compartment), we evaluate random effect necessity. Diagnostics suggest minimal between-subject variability in Ka, motivating a simplified random effect structure.
oral_model_2cmt_noka = @model begin
@metadata begin
desc = "Two-compartment oral model without Ka random effect"
timeu = u"hr"
end
@param begin
θka ∈ RealDomain(lower = 0.01)
θcl ∈ RealDomain(lower = 0.01)
θvc ∈ RealDomain(lower = 0.01, upper = 190.0)
θq ∈ RealDomain(lower = 0.01, upper = 90.0)
θvp ∈ RealDomain(lower = 0.01, upper = 190.0)
Ω_pk ∈ PDiagDomain(4) # Now 4 instead of 5
σ_add_pk ∈ RealDomain(lower = 0.0)
σ_prop_pk ∈ RealDomain(lower = 0.0)
end
@random begin
η ~ MvNormal(Ω_pk)
end
@pre begin
ηcl, ηvc, ηq, ηvp = η # Only 4 random effects now
Ka = θka # No random effect
CL = θcl * exp(ηcl)
Vc = θvc * exp(ηvc)
Q = θq * exp(ηq)
Vp = θvp * exp(ηvp)
end
@dynamics Depots1Central1Periph1
@derived begin
ipred := @. Central / Vc
CONC ~ @. Normal(ipred, sqrt(σ_add_pk^2 + (ipred * σ_prop_pk)^2))
end
endPumasModel
Parameters: θka, θcl, θvc, θq, θvp, Ω_pk, σ_add_pk, σ_prop_pk
Random effects: η
Covariates:
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: CONC
Observed: CONC
Fit the model:
foce_noka = fit(
oral_model_2cmt_noka,
population,
(coef(foce_fit_2cmt)..., Ω_pk = Diagonal([0.1, 0.1, 0.1, 0.1])),
FOCE();
optim_options = (; show_trace = false),
verbose=false,
)To compare parameter estimates between models, use compare_estimates():
compare_estimates(with_ka = foce_fit_2cmt, without_ka = foce_noka)| Row | parameter | with_ka | without_ka |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | θka | 1.8091 | 1.80817 |
| 2 | θcl | 1.3014 | 1.30141 |
| 3 | θvc | 33.9426 | 33.9226 |
| 4 | θq | 11.8143 | 11.8179 |
| 5 | θvp | 77.7894 | 77.7969 |
| 6 | Ω_pk₁,₁ | 0.00122221 | 0.174929 |
| 7 | Ω_pk₂,₂ | 0.17496 | 0.201822 |
| 8 | Ω_pk₃,₃ | 0.202418 | 0.0297052 |
| 9 | Ω_pk₄,₄ | 0.0290994 | 0.0977277 |
| 10 | σ_add_pk | 0.0150842 | 0.0150817 |
| 11 | σ_prop_pk | 0.0980985 | 0.098142 |
| 12 | Ω_pk₅,₅ | 0.09743 | missing |
This produces a side-by-side table of parameter estimates, making it easy to see how removing the Ka random effect impacts other parameters.
For model selection, use information criteria:
bic_ka, bic_noka = bic(foce_fit_2cmt), bic(foce_noka)(2796.806076045644, 2789.0851311814536)
ll_ka, ll_noka = loglikelihood(foce_fit_2cmt), loglikelihood(foce_noka)(-1351.9590516113187, -1351.968911380182)
Lower BIC indicates better model fit penalized for complexity. If BIC improves (decreases) when removing the Ka random effect, the simpler model is preferred.
With structural and random effect models established, we investigate covariate relationships to explain residual between-subject variability and enable population extrapolation.
foce_inspect_2cmt_noka = inspect(foce_noka)The empirical_bayes_vs_covariates() function plots EBEs (individual random effects) against available covariates:
empirical_bayes_vs_covariates(foce_inspect_2cmt_noka;
covariates = [:WEIGHTB, :SEX, :DOSE],
categorical = [:SEX, :DOSE],
color = (:grey, 0.5))If EBEs show trends or correlations with covariates, it suggests the covariate may explain some of the unexplained variability currently captured by random effects. This guides covariate selection.
Covariate models should be coded such that setting the covariate parameter to zero (or a neutral value) removes the covariate effect. This makes model comparison straightforward.
For categorical covariates (e.g., sex), we use conditional logic to apply different parameter values to different categories. Consider an effect of sex on clearance:
Using if-elseif-else blocks:
@pre begin
COVsexCL = if SEX == "Female"
1 + θsexCLF # Effect for females
elseif SEX == "Male"
1 # Reference (no effect)
else
error("Expected SEX to be either \"Female\" or \"Male\" but the value was: $SEX")
end
CL = θcl * exp(η[1]) * COVsexCL
Vc = θvc * exp(η[2])
endUsing ternary expressions (more concise):
@pre begin
COVsexCL =
SEX == "Female" ? 1 + θsexCLF :
(SEX == "Male" ? 1 :
error("Expected SEX to be either \"Female\" or \"Male\" but the value was: $SEX"))
CL = θcl * exp(η[1]) * COVsexCL
Vc = θvc * exp(η[2])
endIn this parameterization, θsexCLF = 0 means no sex effect. A positive value increases clearance in females relative to males.
For continuous covariates (e.g., weight), we typically use power models centered at a reference value:
@pre begin
# Effect of body weight on Vc
COVwtVc = (WEIGHTB / 70)^θvcWEIGHTB
CL = θcl * exp(η[1])
Vc = θvc * exp(η[2]) * COVwtVc
endHere, θvcWEIGHTB = 0 removes the weight effect (since (WEIGHTB/70)^0 = 1). A value of 1 corresponds to allometric scaling (Vc ∝ weight).
Based on EBE diagnostics showing relationships between sex and CL, and between weight and Vc, we build a covariate model:
mdl_cov = @model begin
@metadata begin
desc = "Two-compartment model with covariates"
timeu = u"hr"
end
@param begin
θka ∈ RealDomain(lower = 0.01)
θcl ∈ RealDomain(lower = 0.01)
θvc ∈ RealDomain(lower = 0.01, upper = 190.0)
θq ∈ RealDomain(lower = 0.01, upper = 90.0)
θvp ∈ RealDomain(lower = 0.01, upper = 190.0)
θsexCLF ∈ RealDomain(lower = -10.0, upper = 10.0)
θvcWEIGHTB ∈ RealDomain(lower = 0.1, upper = 10.0)
Ω_pk ∈ PDiagDomain(4)
σ_add_pk ∈ RealDomain(lower = 0.0)
σ_prop_pk ∈ RealDomain(lower = 0.0)
end
@covariates SEX WEIGHTB
@random begin
η ~ MvNormal(Ω_pk)
end
@pre begin
ηcl, ηvc, ηq, ηvp = η
Ka = θka
COVsexCL =
SEX == "Female" ? 1 + θsexCLF :
(SEX == "Male" ? 1 :
error("Expected SEX to be either \"Female\" or \"Male\" but the value was: $SEX"))
CL = θcl * COVsexCL * exp(ηcl)
Vc = θvc * (WEIGHTB/77)^θvcWEIGHTB * exp(ηvc)
Q = θq * exp(ηq)
Vp = θvp * exp(ηvp)
end
@dynamics Depots1Central1Periph1
@derived begin
ipred := @. Central / Vc
CONC ~ @. Normal(ipred, sqrt(σ_add_pk^2 + (ipred * σ_prop_pk)^2))
end
endPumasModel
Parameters: θka, θcl, θvc, θq, θvp, θsexCLF, θvcWEIGHTB, Ω_pk, σ_add_pk, σ_prop_pk
Random effects: η
Covariates: SEX, WEIGHTB
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: CONC
Observed: CONC
Note the @covariates declaration specifying which covariates the model uses.
Validate initial parameters:
initial_cov = (;
coef(foce_noka)...,
θsexCLF = 0.1,
θvcWEIGHTB = 1.0,
)
foce_loglikelihood = loglikelihood(mdl_cov, population, initial_cov, FOCE())-1358.9167442522842
We reuse parameters from the base model (foce_noka) and add neutral initial values for covariate effects. This approach minimizes disruption to the parameter space—covariate effects start small and are estimated from data.
When introducing covariates, set their initial values to “neutral” (zero for additive effects, one for multiplicative effects). This ensures that fixed effect parameters from the base model remain approximately valid. If you set a covariate parameter to a large initial value, you may need to adjust the corresponding fixed effect parameter to compensate.
Fit the model:
oral_model_covar_results = fit(
mdl_cov,
population,
initial_cov,
Pumas.FOCE();
optim_options = (show_trace = false,),
)Compare estimates:
compare_estimates(;foce_noka, oral_model_covar_results)| Row | parameter | foce_noka | oral_model_covar_results |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | θka | 1.80817 | 1.80948 |
| 2 | θcl | 1.30141 | 1.90963 |
| 3 | θvc | 33.9226 | 33.8341 |
| 4 | θq | 11.8179 | 11.8087 |
| 5 | θvp | 77.7969 | 77.8592 |
| 6 | Ω_pk₁,₁ | 0.174929 | 0.03793 |
| 7 | Ω_pk₂,₂ | 0.201822 | 0.202339 |
| 8 | Ω_pk₃,₃ | 0.0297052 | 0.029917 |
| 9 | Ω_pk₄,₄ | 0.0977277 | 0.0982499 |
| 10 | σ_add_pk | 0.0150817 | 0.0150797 |
| 11 | σ_prop_pk | 0.098142 | 0.09812 |
| 12 | θsexCLF | missing | -0.522478 |
| 13 | θvcWEIGHTB | missing | 0.1 |
We expect to see:
θcl adjusts as sex effect is introduced)θsexCLF, θvcWEIGHTB) with confidence intervals not including neutral values indicate significant effectsDiagnostics:
oral_model_covar_inspect = inspect(oral_model_covar_results)Indeed, we see the expected change in parameter estimates
coeftable(oral_model_covar_results)| Row | parameter | constant | estimate |
|---|---|---|---|
| String | Bool | Float64 | |
| 1 | θka | false | 1.80948 |
| 2 | θcl | false | 1.90963 |
| 3 | θvc | false | 33.8341 |
| 4 | θq | false | 11.8087 |
| 5 | θvp | false | 77.8592 |
| 6 | θsexCLF | false | -0.522478 |
| 7 | θvcWEIGHTB | false | 0.1 |
| 8 | Ω_pk₁,₁ | false | 0.03793 |
| 9 | Ω_pk₂,₂ | false | 0.202339 |
| 10 | Ω_pk₃,₃ | false | 0.029917 |
| 11 | Ω_pk₄,₄ | false | 0.0982499 |
| 12 | σ_add_pk | false | 0.0150797 |
| 13 | σ_prop_pk | false | 0.09812 |
and in the plot of empirical bayes estimates against covariates
empirical_bayes_vs_covariates(oral_model_covar_inspect;
covariates = [:WEIGHTB, :SEX, :DOSE],
categorical = [:SEX, :DOSE],
color = (:grey, 0.5))After including covariates, the EBE vs. covariate plots should show reduced trends, indicating that covariates have successfully explained variability previously captured by random effects.
Pumas supports automated covariate selection via covariate_select(). This function performs stepwise forward/backward selection. However, automatic covariate selection is computationally expensive and requires a stable base model. It should be used judiciously and results should be validated with clinical and physiological knowledge. Full details are in the documentation at docs.pumas.ai.
Before finalizing our PK model, we quantify parameter uncertainty. This is essential for assessing statistical significance, evaluating model stability, and propagating uncertainty to simulations. This corresponds to Stages 9-10 in the workflow map.
The infer() function computes asymptotic standard errors using the sandwich estimator:
infer_covar = infer(oral_model_covar_results)
infer_table = coeftable(infer_covar)[ Info: Calculating: variance-covariance matrix. [ Info: Done.
| Row | parameter | constant | estimate | se | relative_se | ci_lower | ci_upper |
|---|---|---|---|---|---|---|---|
| String | Bool | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | θka | false | 1.80948 | 0.0620353 | 0.0342835 | 1.68789 | 1.93107 |
| 2 | θcl | false | 1.90963 | 0.0839004 | 0.0439354 | 1.74519 | 2.07407 |
| 3 | θvc | false | 33.8341 | 2.52024 | 0.0744884 | 28.8945 | 38.7736 |
| 4 | θq | false | 11.8087 | 0.435237 | 0.0368575 | 10.9556 | 12.6617 |
| 5 | θvp | false | 77.8592 | 3.58216 | 0.0460082 | 70.8383 | 84.8801 |
| 6 | θsexCLF | false | -0.522478 | 0.0272342 | 0.0521251 | -0.575856 | -0.4691 |
| 7 | θvcWEIGHTB | false | 0.1 | 0.872923 | 8.72923 | -1.6109 | 1.8109 |
| 8 | Ω_pk₁,₁ | false | 0.03793 | 0.0100196 | 0.264162 | 0.0182918 | 0.0575681 |
| 9 | Ω_pk₂,₂ | false | 0.202339 | 0.0365084 | 0.180432 | 0.130783 | 0.273894 |
| 10 | Ω_pk₃,₃ | false | 0.029917 | 0.0111367 | 0.372252 | 0.00808953 | 0.0517445 |
| 11 | Ω_pk₄,₄ | false | 0.0982499 | 0.0183905 | 0.187181 | 0.0622052 | 0.134295 |
| 12 | σ_add_pk | false | 0.0150797 | 0.00248389 | 0.164718 | 0.0102113 | 0.019948 |
| 13 | σ_prop_pk | false | 0.09812 | 0.00214825 | 0.0218941 | 0.0939095 | 0.10233 |
With a well-characterized PK model in hand, we now extend our analysis to pharmacodynamics (PD). Pharmacodynamic modeling links drug exposure (concentrations) to clinical or biological responses. In this section, we demonstrate multi-endpoint modeling, where both PK and PD observations are modeled simultaneously within a single framework. This approach corresponds to Stage 11 in the workflow map, specifically multi-endpoint modeling and sequential PKPD workflows.
We build an indirect response model where drug concentrations inhibit the degradation of a biomarker, with an effect compartment to capture hysteresis and a Hill coefficient for steep exposure-response. We demonstrate two approaches:
We model inhibition of kout using an Emax model with Hill coefficient and effect compartment:
Ce’ = Ke0 × (Conc - Ce)
INH = Imax × Ceγ / (IC50γ + Ceγ)
Response’ = kin - kout × (1 - INH) × Response
Where:
At baseline (Conc = Ce = 0), Response’ = 0 and Response = kin/kout.
We extend the PK model from the previous section to include PD dynamics. The model includes:
@derivedNote that we now use explicit ODEs in the @dynamics block rather than the name of a standard model. This is necessary because the PD dynamics (Response equation) must be coupled to the PK dynamics (Central compartment concentration).
mdl_pkpd = @model begin
@metadata begin
desc = "Simultaneous PKPD model with indirect response (Kout inhibition, effect compartment, Hill coefficient)"
timeu = u"hr"
end
@param begin
# PK parameters (from covariate model)
θka ∈ RealDomain(lower=0.01)
θcl ∈ RealDomain(lower=0.01)
θvc ∈ RealDomain(lower=0.01, upper=190.0)
θq ∈ RealDomain(lower=0.01, upper=90.0)
θvp ∈ RealDomain(lower=0.01, upper=190.0)
θsexCLF ∈ RealDomain(lower=-10.0, upper=10.0)
θvcWEIGHTB ∈ RealDomain(lower=0.01, upper=190.0)
Ω_pk ∈ PDiagDomain(4)
σ_add_pk ∈ RealDomain(lower=0.0)
σ_prop_pk ∈ RealDomain(lower=0.0)
# PD parameters - Kout inhibition with effect compartment and Hill coefficient
tvkin ∈ RealDomain(lower=0.001, upper=30)
tvkout ∈ RealDomain(lower=0.001, upper=100.0)
tvic50 ∈ RealDomain(lower=0.01, upper=100)
tvimax ∈ RealDomain(lower=0.0, upper=1.0)
tvke0 ∈ RealDomain(lower=0.001, upper=10.0)
tvgamma ∈ RealDomain(lower=0.1, upper=10.0)
Ωpd ∈ PDiagDomain(2)
σ_add_pd ∈ RealDomain(lower=0.0)
σ_prop_pd ∈ RealDomain(lower=0.0)
end
@covariates SEX WEIGHTB
@random begin
η ~ MvNormal(Ω_pk)
ηpd ~ MvNormal(Ωpd)
end
@pre begin
# PK parameters
Ka = θka
COVsexCL = if SEX == "Female"
1 + θsexCLF
elseif SEX == "Male"
1
else
error("Expected SEX to be either \"Female\" or \"Male\" but the value was: $SEX")
end
CL = θcl * COVsexCL * exp(η[1])
Vc = θvc * (WEIGHTB / 77)^θvcWEIGHTB * exp(η[2])
Q = θq * exp(η[3])
Vp = θvp * exp(η[4])
# PD parameters - Kout inhibition with effect compartment
kin = tvkin
kout = tvkout
ic50 = tvic50 * exp(ηpd[1])
imax = tvimax
Ke0 = tvke0 * exp(ηpd[2])
gamma = tvgamma
end
@vars begin
Conc = Central / Vc
Ce_safe = max(Ce, 0.0)
INH = imax * Ce_safe^gamma / (ic50^gamma + Ce_safe^gamma)
end
@init begin
Response = kin / kout
Ce = 0.0
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL + Q) / Vc * Central + Q / Vp * Peripheral
Peripheral' = Q / Vc * Central - Q / Vp * Peripheral
# Effect compartment introduces delay for hysteresis
Ce' = Ke0 * (Conc - Ce)
# Kout inhibition with Hill coefficient for steep exposure-response
Response' = kin - kout * (1 - INH) * Response
end
@derived begin
conc_model := @. Central / Vc
CONC ~ @. Normal(conc_model, sqrt(σ_add_pk^2 + (conc_model * σ_prop_pk)^2))
PD_CONC ~ @. Normal(Response, sqrt(σ_add_pd^2 + (Response * σ_prop_pd)^2))
end
endPumasModel
Parameters: θka, θcl, θvc, θq, θvp, θsexCLF, θvcWEIGHTB, Ω_pk, σ_add_pk, σ_prop_pk, tvkin, tvkout, tvic50, tvimax, tvke0, tvgamma, Ωpd, σ_add_pd, σ_prop_pd
Random effects: η, ηpd
Covariates: SEX, WEIGHTB
Dynamical system variables: Depot, Central, Peripheral, Ce, Response
Dynamical system type: Nonlinear ODE
Derived: CONC, PD_CONC, Conc, Ce_safe, INH
Observed: CONC, PD_CONC, Conc, Ce_safe, INH
Multiple @random distributions: We have separate random effect vectors for PK (η) and PD (ηpd). This allows PK and PD random effects to be independent.
@vars block: Defines intermediate variables (here, Conc) that can be used in the @dynamics block. This improves code readability by avoiding repeated expressions.
Explicit ODEs: The @dynamics block now contains explicit differential equations rather than the name of a standard model. The PD equation depends on Conc, which couples PK and PD dynamics.
Multiple observations in @derived: Both CONC and PD_CONC are specified, each with its own error model. Pumas will match these to the corresponding observation columns in the data.
We reuse PK parameters from the final covariate model and add initial PD parameter values:
init_θ_pkpd = (
coef(oral_model_covar_results)...,
# PD parameters - Kout inhibition with effect compartment and Hill coefficient
tvkin=1.5, # Production rate (baseline = kin/kout = 50)
tvkout=0.03, # Degradation rate constant
tvic50=12.0, # IC50 for inhibition (mid-range of concentrations)
tvimax=0.9, # Maximum inhibition (90%)
tvke0=0.08, # Effect compartment rate constant (creates hysteresis)
tvgamma=1.5, # Hill coefficient (moderate exposure-response steepness)
Ωpd=Diagonal([0.04, 0.04]), # Variability on IC50 and Ke0
σ_add_pd=0.5, # Additive error (scaled for higher response values)
σ_prop_pd=0.05, # Proportional error
)The ... operator unpacks all parameters from the PK model fit, and we append PD-specific parameters.
Before fitting, simulate to check whether PD initial parameters are reasonable:
sim_pd = simobs(mdl_pkpd, population, init_θ_pkpd; rng)
sim_plot(sim_pd; observations=[:PD_CONC],
markercolor=:grey,
axis=(ylabel="PD Concentration", xlabel="Time (hours)"),
figure=(size=(800, 600), fontsize=18),
legend=(position=:bottom, framevisible=false, labelsize=14, nbanks=1),)This plot shows simulated PD responses at initial parameter values. If the simulated profiles are wildly inconsistent with observed PD data, adjust initial values before fitting.
For the simultaneous approach, we estimate all parameters together. However, since we have high confidence in the PK parameter estimates from the previous section, we can fix the PK parameters and estimate only PD parameters. This is done using the constantcoef argument:
pd_ft = fit(
mdl_pkpd,
population,
init_θ_pkpd,
FOCE();
constantcoef=keys(coef(oral_model_covar_results)),
optim_options=(; show_trace=false,),
)The constantcoef argument takes a collection of parameter names to hold fixed during optimization. Here, keys(coef(oral_model_covar_results)) returns all PK parameter names, so only PD parameters (tvkin, tvkout, tvic50, tvimax, tvke0, tvgamma, Ωpd, σ_add_pd, σ_prop_pd) are estimated.
Simultaneous estimation (removing constantcoef) would re-estimate all PK and PD parameters jointly. This approach:
Sequential estimation (using constantcoef to fix PK parameters) is appropriate when:
For this tutorial, we use the sequential approach within a simultaneous model framework by fixing PK parameters. This is just one of many sequential approaches that can be used.
Compute diagnostics:
pd_ins = inspect(pd_ft)Goodness-of-fit for PD observations:
goodness_of_fit(pd_ins; observations=[:PD_CONC],
markercolor=(:grey, 0.5),
legend=(
position=:bottom,
framevisible=false,
labelsize=18,
patchsize=(20, 10),
),
figure=(size=(800, 800), fontsize=18))An alternative to the approach above is true sequential modeling, where:
This approach completely separates PK and PD estimation and is useful when PK and PD data come from different studies or sampling schemes.
From the PK model fit, we can extract individual parameter estimates. One approach is to simulate from the PK model and use the simulated data to construct a new population:
sim_df = DataFrame(pd_ins)Alternatively, you could extract EBEs directly from the inspect() object and merge them with the dataset.
We construct a new population for PD modeling where individual CL and Vc are covariates:
pop_pd = read_pumas(
sim_df;
id=:id,
time=:time,
observations=[:PD_CONC],
covariates=[:Ka, :CL, :Vc, :Q, :Vp, :DOSE,],
# amt, evid, cmt as appropriate for dosing
)Population
Subjects: 50
Covariates: Ka, CL, Vc, Q, Vp, DOSE
Observations: PD_CONC
The PD model no longer estimates PK parameters. Instead, it uses individual CL and Vc as covariates:
pk_seq_pd = @model begin
@metadata begin
desc = "Sequential PD model with PK parameter covariates (Kout inhibition, effect compartment, Hill coefficient)"
timeu = u"hr"
end
@covariates CL Vc Q Vp Ka DOSE
@param begin
# PD parameters - Kout inhibition with effect compartment and Hill coefficient
tvkin ∈ RealDomain(lower=0.001, upper=30)
tvkout ∈ RealDomain(lower=0.001, upper=100.0)
tvic50 ∈ RealDomain(lower=0.01, upper=100)
tvimax ∈ RealDomain(lower=0.0, upper=1.0)
tvke0 ∈ RealDomain(lower=0.001, upper=10.0)
tvgamma ∈ RealDomain(lower=0.1, upper=10.0)
Ωpd ∈ PDiagDomain(2)
σ_add_pd ∈ RealDomain(lower=0.0)
σ_prop_pd ∈ RealDomain(lower=0.0)
end
@random begin
ηpd ~ MvNormal(Ωpd)
end
@pre begin
# PD parameters - Kout inhibition with effect compartment
kin = tvkin
kout = tvkout
ic50 = tvic50 * exp(ηpd[1])
imax = tvimax
Ke0 = tvke0 * exp(ηpd[2])
gamma = tvgamma
end
@vars begin
Conc = Central / Vc
Ce_safe = max(Ce, 0.0)
INH = imax * Ce_safe^gamma / (ic50^gamma + Ce_safe^gamma)
end
@init begin
Response = kin / kout
Ce = 0.0
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL + Q) / Vc * Central + Q / Vp * Peripheral
Peripheral' = Q / Vc * Central - Q / Vp * Peripheral
# Effect compartment introduces delay for hysteresis
Ce' = Ke0 * (Conc - Ce)
# Kout inhibition with Hill coefficient for steep exposure-response
Response' = kin - kout * (1 - INH) * Response
end
@derived begin
PD_CONC ~ @. Normal(Response, sqrt(σ_add_pd^2 + (Response * σ_prop_pd)^2))
end
end
init_θ_pkpd_seq = (
# PD parameters - Kout inhibition with effect compartment and Hill coefficient
tvkin=1.5, # Production rate (baseline = kin/kout = 50)
tvkout=0.03, # Degradation rate constant
tvic50=12.0, # IC50 for inhibition (mid-range of concentrations)
tvimax=0.9, # Maximum inhibition (90%)
tvke0=0.08, # Effect compartment rate constant (creates hysteresis)
tvgamma=1.5, # Hill coefficient (moderate exposure-response steepness)
Ωpd=Diagonal([0.04, 0.04]), # Variability on IC50 and Ke0
σ_add_pd=0.5, # Additive error (scaled for higher response values)
σ_prop_pd=0.05, # Proportional error
)┌ Warning: Covariate DOSE is not used in the model. └ @ Pumas none:3167
(tvkin = 1.5,
tvkout = 0.03,
tvic50 = 12.0,
tvimax = 0.9,
tvke0 = 0.08,
tvgamma = 1.5,
Ωpd = [0.04 0.0; 0.0 0.04],
σ_add_pd = 0.5,
σ_prop_pd = 0.05,)
This model uses inhibition of Kout with an effect compartment and Hill coefficient, matching the simultaneous model structure.
ft_seq = fit(pk_seq_pd, pop_pd, init_θ_pkpd_seq, FOCE();
optim_options=(; show_trace=false), verbose=false)This fit estimates only PD parameters, treating individual PK parameters as known.
The estimates should generally be the same, but some numerical differences is to be expected.
compare_estimates(; pd_ft, ft_seq)| Row | parameter | pd_ft | ft_seq |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | tvkin | 1.50157 | 1.49754 |
| 2 | tvkout | 0.0300331 | 0.0299498 |
| 3 | tvic50 | 12.2191 | 12.1899 |
| 4 | tvimax | 0.9143 | 0.914112 |
| 5 | tvke0 | 0.0794148 | 0.0797224 |
| 6 | tvgamma | 1.4131 | 1.41748 |
| 7 | Ωpd₁,₁ | 0.0465215 | 0.0461238 |
| 8 | Ωpd₂,₂ | 0.00970952 | 8.5971e-8 |
| 9 | σ_add_pd | 0.00594804 | 0.00124218 |
| 10 | σ_prop_pd | 0.0509662 | 0.05096 |
| 11 | θka | 1.80948 | missing |
| 12 | θcl | 1.90963 | missing |
| 13 | θvc | 33.8341 | missing |
| 14 | θq | 11.8087 | missing |
| 15 | θvp | 77.8592 | missing |
| 16 | θsexCLF | -0.522478 | missing |
| 17 | θvcWEIGHTB | 0.1 | missing |
| 18 | Ω_pk₁,₁ | 0.03793 | missing |
| 19 | Ω_pk₂,₂ | 0.202339 | missing |
| 20 | Ω_pk₃,₃ | 0.029917 | missing |
| 21 | Ω_pk₄,₄ | 0.0982499 | missing |
| 22 | σ_add_pk | 0.0150797 | missing |
| 23 | σ_prop_pk | 0.09812 | missing |
Advantages:
Disadvantages:
For most analyses with rich PK and PD data from the same study, the simultaneous approach (with or without constantcoef) is preferred. Sequential modeling is most valuable when data sources differ or when computational constraints are severe.
Consider sequential PKPD when:
For comprehensive final analyses, simultaneous estimation is generally recommended to properly account for uncertainty.
In this section, we extended our PK model to include pharmacodynamic endpoints, demonstrating multi-endpoint modeling approaches:
Simultaneous PKPD Modeling (✓)
@dynamics to couple PK and PD@derived (CONC and PD_CONC)constantcoef to estimate only PD parametersSequential PKPD Modeling
With a validated PKPD model, we’re now ready to proceed to simulation for decision support in the next section, where we’ll evaluate alternative dosing regimens and assess target attainment metrics to support dosing decisions.
In the next section, we will:
DosageRegimen()simobs()This represents the culmination of the pharmacometric workflow—translating model-based understanding into actionable clinical recommendations.
Pharmacometrics is inherently different from clinical statistics or biostatistics and one big difference is the focus on model building in pharmacometrics. Models are useful because they allow us to simulate scenarios to answer counter factual questions. This can be used to design clinical trials, apply for clinical trial waivers and more.
We will show how simulations can be implemented in Pumas to evaluate six alternative dosage regimens: 0 (placebo), 100, 200, 400, 800, or 1600 mg PO daily for 14 days. We are interested in three metrics.
The probability of obtaining a PD_CONC within the therapeutic range (80-150 units) at any time during treatment. The time needed to reach the first therapeutic PD_CONC value. The total time spent in the TR as a percentage of the dosing interval (i.e., 24 hours) at SS (Day 14).
The estimated population half-life for warfarin per our model is ~41 hours which means it should take roughly 9 days (on average) to achieve steady-state; we extend this to 14 days to ensure each of our virtual subjects is at SS prior to evaluation. We will generate a Population of 100 Subjects and use it simulate a total of 600 trials (200 per dosage regimen).
We will not include RUV, since most variability comes from BSV and RUV can make results difficult to interpret.
tr_low, tr_high = 75, 125(75, 125)
Layer
transformation: identity
data: Nothing
positional:
named:
color: :DOSE => (AlgebraOfGraphics.Renamer{Vector{Int64}, Vector{String}}([0, 100, 200, 400, 800, 1600], ["0 mg", "100 mg", "200 mg", "400 mg", "800 mg", "1600 mg"]) => "")
col: :DOSE => AlgebraOfGraphics.Renamer{Vector{Int64}, Vector{String}}([0, 100, 200, 400, 800, 1600], ["0 mg", "100 mg", "200 mg", "400 mg", "800 mg", "1600 mg"])
# combine layers and draw
fig6_pta = (data(_tbl) * (pi_layer + median_layer) * cf_map) + tr_layer |> draw(
scales(;
Y=(; label="Biomarker level"),
X=(; label="Time after previous dose, hours"),
secondary=(; palette=[:gray30]),
);
figure=(;
size=(6, 4) .* 96,
title="Predicted Biomarker Profiles at Dose Number 6 ",
subtitle="Median (line), 90%PI (band), TR (dash-line)",
titlealign=:left,
),
axis=(;
limits=(0, nothing, 0, nothing),
xticks=0:6:24,
xlabelpadding=10,
# yticks = 0:20:80,
),
legend=(; orientation=:horizontal, framevisible=false, position=:bottom, nbanks=2, labelsize=10,),
)
# save("./figures/fig6-pta.png", fig6_pta, px_per_unit=2)
fig6_pta| Row | DOSE | rep_id | ta | tta | ttr |
|---|---|---|---|---|---|
| Int64? | Int64 | Float64 | Float64? | Float64? | |
| 1 | 0 | 1 | 0.0 | missing | missing |
| 2 | 100 | 1 | 0.0 | missing | missing |
| 3 | 200 | 1 | 0.0 | missing | missing |
| 4 | 400 | 1 | 50.0 | 120.1 | 99.1667 |
| 5 | 800 | 1 | 90.0 | 120.1 | 99.1667 |
| 6 | 1600 | 1 | 80.0 | 120.1 | 57.7083 |
| 7 | 0 | 2 | 0.0 | missing | missing |
| 8 | 100 | 2 | 0.0 | missing | missing |
| 9 | 200 | 2 | 0.0 | missing | missing |
| 10 | 400 | 2 | 90.0 | 124.933 | 79.0278 |
| 11 | 800 | 2 | 100.0 | 120.1 | 94.2083 |
| 12 | 1600 | 2 | 60.0 | 120.1 | 90.9028 |
| 13 | 0 | 3 | 0.0 | missing | missing |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 589 | 0 | 99 | 0.0 | missing | missing |
| 590 | 100 | 99 | 0.0 | missing | missing |
| 591 | 200 | 99 | 0.0 | missing | missing |
| 592 | 400 | 99 | 60.0 | 122.083 | 90.9028 |
| 593 | 800 | 99 | 100.0 | 120.1 | 99.1667 |
| 594 | 1600 | 99 | 70.0 | 120.1 | 66.0119 |
| 595 | 0 | 100 | 0.0 | missing | missing |
| 596 | 100 | 100 | 0.0 | missing | missing |
| 597 | 200 | 100 | 0.0 | missing | missing |
| 598 | 400 | 100 | 50.0 | 124.46 | 81.0 |
| 599 | 800 | 100 | 100.0 | 122.48 | 89.25 |
| 600 | 1600 | 100 | 80.0 | 120.1 | 47.3958 |
We evaluate three key efficacy metrics across dose levels:
table_one(
dfproccessed,
[
:ta => war_metrics => "Probability of TA",
:tta => war_metrics => "Time to Target",
:ttr => war_metrics => "Time in TR",
],
sort=false,
groupby=:DOSE => "Dose, mg",
show_total=false,
)| Dose, mg | ||||||
| 0 | 100 | 200 | 400 | 800 | 1600 | |
| Probability of TA | ||||||
| Median | 0 | 0 | 0 | 60 | 100 | 70 |
| 95% CI | [0, 0] | [0, 0] | [0, 10] | [30, 80] | [80, 100] | [50, 90] |
| Time to Target | ||||||
| Median | - | - | 133 | 124 | 120 | 120 |
| 95% CI | - | - | [120, 143] | [120, 130] | [120, 123] | [120, 120] |
| Time in TR | ||||||
| Median | - | - | 45.4 | 83.4 | 94.2 | 69.9 |
| 95% CI | - | - | [1.84, 99.2] | [56.8, 99.2] | [80.2, 99.2] | [48.7, 92.9] |
These simulation results support dose selection by:
The final dose selection should integrate these model-based predictions with clinical judgment, safety data, and practical considerations (e.g., formulation constraints, dosing convenience).
Pumas is available through multiple deployment options to suit different organizational needs and computational requirements.
For individual users and small teams, Pumas Desktop provides a complete local installation with VS Code integration. This option is ideal for:
Installation instructions: https://docs.pumas.ai/stable/platformdetails/desktop/
For teams requiring shared computational resources and collaborative workflows, Pumas Cluster provides a managed high-performance computing environment. This option supports:
Setup and configuration: https://docs.pumas.ai/stable/platformdetails/pumascluster/
For organizations requiring enterprise-grade security, scalability, and support, PumasAI offers a fully managed cloud platform with:
For enterprise inquiries: Contact sales@pumas.ai
The Pumas ecosystem provides comprehensive resources for learning, troubleshooting, and staying current with platform developments.